Dimensionality Reduction via Genetic Value Clustering

نویسندگان

  • Alexander P. Topchy
  • William F. Punch
چکیده

Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and machine learning applications. We present a family of genetic algorithms for feature synthesis through clustering of discrete attribute values. The approach uses new compact graph-based encoding for cluster representation, where size of GA search space is reduced exponentially with respect to the number of items in partitioning, as compared to original idea of Park and Song. We apply developed algorithms and study their effectiveness for DNA fingerprinting in population genetics and text categorization.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dimension Reduction of Health Data Clustering

The current data tends to be more complex than conventional data and need dimension reduction. Dimension reduction is important in cluster analysis and creates a smaller data in volume and has the same analytical results as the original representation. A clustering process needs data reduction to obtain an efficient processing time while clustering and mitigate curse of dimensionality. This pap...

متن کامل

Randomized Dimensionality Reduction for k-Means Clustering

We study the topic of dimensionality reduction for k-means clustering. Dimensionality reduction encompasses the union of two approaches: 1) feature selection and 2) feature extraction. A feature selection-based algorithm for k-means clustering selects a small subset of the input features and then applies k-means clustering on the selected features. A feature extraction-based algorithm for k-mea...

متن کامل

Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)

This paper shows how we can make advantage of using genetic programming in selection of suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, selection of suitable features may significantly affect the performance of the process. Simulations were conducted with 5db and 10db SNRs. Test and ...

متن کامل

Feature Selection based Semi-Supervised Subspace Clustering

Clustering is the process which is used to assign a set of n objects into clusters(groups). Dimensionality reduction techniques help in increasing the accuracy of clustering results by removing redundant and irrelevant dimensions. But, in most of the situations, objects can be related in different ways in different subsets of the dimensions. Dimensionality reduction tends to get rid of such rel...

متن کامل

A comparative study of dimensionality reduction techniques to enhance trace clustering performances

Process mining techniques have been used to analyze event logs from information systems in order to derive useful patterns. However, in the big data era, real-life event logs are huge, unstructured, and complex so that traditional process mining techniques have difficulties in the analysis of big logs. To reduce the complexity during the analysis, trace clustering can be used to group similar t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003